Abstract

The exact distribution of the sample mean from a double exponential (Laplace) model is
derived. A classical subset selection procedure based on the sample mean for selecting the
population associated with the largest location parameter of $k$ double exponential (Laplace)
distributions is studied. For the case when a non-informative prior is introduced into the problem,
the relation between the classical Maximum-Type Procedure Rule and the so-called Bayes-$P^*$
subset selection procedure rule is studied. An improved bound for the guarantee probability
of a correct selection for the classical subset selection rule $R^{\max}$, which relates the rule $R^{\max}$ to the
selected subset size (notice that the subset selection rule $R^{\max}$ may select all the populations),
is studied, and some improved rules of the type $R^{\max}$ are provided.


1 Introduction

Suppose we have $k$ double exponential populations $\Pi_1, \Pi_2, \ldots, \Pi_k$, where each $\Pi_i$ is characterized
by the location parameter $\theta_i$, $i = 1, 2, \ldots, k$. The parameters $\theta_i$ are assumed to be
unknown. Let $X_i$ be the observable random variable from $\Pi_i$ with probability density function

$$f(x; \theta_i, \sigma) = \frac{1}{2\sigma} \exp\left\{ -\frac{|x - \theta_i|}{\sigma} \right\}, \quad -\infty < x, \theta_i < \infty, \quad \sigma > 0, \tag{1}$$

where $\sigma$ is a common known value for all $i = 1, 2, \ldots, k$, so that without loss of generality, we can
assume that $\sigma = 1$. The ranked parameters are denoted by $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$, and it is assumed
that the correct pairing of the ordered $\theta_{[i]}$'s and the unordered $\theta_i$'s is unknown.

Research supported in part by the Office of Naval Research Contract N00014-88-K-0170 and NSF Grant DMS-
8702620 at Purdue University.

In this paper, we are mainly interested in subset selection procedures. First, we assume that
there is no prior information about the parameters. Then we study the case where the $\theta_i$'s are
independently distributed, and each $\theta_i$ has a non-informative prior.


2  Distribution  of  the  Sample  Mean 


In connection with the selection procedures based on the sample means, we first derive the
distribution of the sample mean.

Let $X_{ij}$ be a random sample from the $i$th population, $i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n$, i.e.

$$X_{ij} \sim f(x|\theta_i, 1) = \tfrac{1}{2} \exp\{-|x - \theta_i|\}.$$

Hence

$$U_{ij} = X_{ij} - \theta_i \sim f(x|0, 1) = \tfrac{1}{2} \exp\{-|x|\}. \tag{2}$$

From the characteristic function of $U_i = \sum_{j=1}^{n} U_{ij}$, we can derive the following lemma.

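Lemma 2.1 (Weida (1935)) Suppose $U_i = \sum_{j=1}^{n} U_{ij}$, where $U_{ij}$ has density (2). Then the density
function of $U_i$ is given by the following formula:

$$p(u) = -\frac{i^{\,n+1}}{(n-1)!} \, \frac{d^{\,n-1}}{dt^{\,n-1}} \left[ \frac{e^{-itu}}{(1 + it)^n} \right]_{t = -i}, \tag{3}$$

where $u \ge 0$, $i$ inside formula (3) denotes the imaginary unit, and

$$p(u) = p(-u) \quad \text{for } u < 0.$$

Let $s = -it$; then (3) becomes

$$p(u) = \frac{1}{(n-1)!} \, \frac{d^{\,n-1}}{ds^{\,n-1}} \left[ \frac{e^{us}}{(1 - s)^n} \right]_{s = -1} = \frac{e^{-u}}{c_n} \sum_{j=1}^{n} c_{n,n-j} \, u^{n-j}. \qquad \Box \tag{4}$$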


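where

$$c_n = 2^n (n-1)!, \qquad c_{n,n-j} = \frac{(n+j-2)!}{(j-1)! \, (n-j)! \, 2^{\,j-1}}, \quad j = 1, 2, \ldots, n. \tag{5}$$

Therefore the density function of $\bar{X}_i = \frac{1}{n} \sum_{j=1}^{n} X_{ij}$ is

$$f_n(|x - \theta_i|) = \frac{n \, e^{-n|x - \theta_i|}}{c_n} \sum_{j=1}^{n} c_{n,n-j} \, (n|x - \theta_i|)^{n-j}, \quad -\infty < x < \infty. \tag{6}$$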

To obtain the coefficients $c_{n,i}$, $i = 0, 1, 2, \ldots, n-1$, $n = 2, 3, \ldots$, it is helpful to rewrite
formula (5) as

$$c_{n,i} = \frac{(2n - i - 2)!}{(n - i - 1)! \, i! \, 2^{\,n-i-1}}.$$

Note that

$$c_{n,n-1} = 1, \qquad c_{n,i} = \frac{(2n - i - 2)(2n - i - 3)}{2(n - i - 1)} \, c_{n-1,i}.$$

In particular,

$$c_{n,0} = c_{n,1}, \qquad c_{n,1} = (2n - 3) \, c_{n-1,1}.$$

In Table 1, we have provided the values of $\{c_n\}$ and $\{c_{n,i}\}$ for $n = 2(1)10$; $i = 1(1)n-1$.
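The closed form for $c_{n,i}$ and the recursion in $n$ can be checked against Table 1 with a few lines of code. The following Python sketch is illustrative only; the function names `c_norm`, `c_coef` and `recursion_holds` are assumptions introduced here and do not come from the paper.

```python
from math import factorial

def c_norm(n):
    """c_n = 2^n (n-1)!, the normalising constant of the density of U_i."""
    return 2 ** n * factorial(n - 1)

def c_coef(n, i):
    """c_{n,i} = (2n-i-2)! / ((n-i-1)! i! 2^(n-i-1)), for 0 <= i <= n-1."""
    num = factorial(2 * n - i - 2)
    den = factorial(n - i - 1) * factorial(i) * 2 ** (n - i - 1)
    return num // den  # the quotient is an integer for these arguments

def recursion_holds(n, i):
    """Check 2(n-i-1) c_{n,i} = (2n-i-2)(2n-i-3) c_{n-1,i}, for 0 <= i <= n-2."""
    lhs = 2 * (n - i - 1) * c_coef(n, i)
    rhs = (2 * n - i - 2) * (2 * n - i - 3) * c_coef(n - 1, i)
    return lhs == rhs

if __name__ == "__main__":
    # Print one column of Table 1 per sample size n.
    for n in range(2, 11):
        print(n, c_norm(n), [c_coef(n, i) for i in range(n)])
```

For $n = 4$ this gives $c_4 = 96$ and the coefficients $15, 15, 6, 1$ used in the worked example for the sample mean of size 4.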

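To find the cdf of $\bar{X}_i$, let us first find the cdf of $U_i$. Integrating the density function (4) of $U_i$, we
have

$$P(u) = \int_{-\infty}^{u} p(t) \, dt = 1 - \frac{e^{-u}}{c_n} \sum_{j=1}^{n} a_{n,n-j} \, u^{n-j} \quad (u \ge 0),$$

where $\{a_{n,n-j}\}$ satisfy:

$$a_{n,n-j} = c_{n,n-j} + (n - j + 1) \, a_{n,n-j+1}, \quad j = 1, 2, \ldots, n, \qquad a_{n,n} = 0.$$

Again we have

$$a_{n,n-1} = 1, \qquad a_{n,n-2} = (n - 1)(n + 2)/2.$$

Hence the cdf of $\bar{X}_i$ is given by

$$F_n(x|\theta_i) = \begin{cases} \dfrac{e^{-n|x - \theta_i|}}{c_n} \displaystyle\sum_{j=1}^{n} a_{n,n-j} \, (n|x - \theta_i|)^{n-j}, & x < \theta_i, \\[2ex] 1 - \dfrac{e^{-n|x - \theta_i|}}{c_n} \displaystyle\sum_{j=1}^{n} a_{n,n-j} \, (n|x - \theta_i|)^{n-j}, & \text{otherwise.} \end{cases} \tag{11}$$

In Table 2, we provide the values of $\{c_n\}$ and $\{a_{n,i}\}$ for $n = 2(1)10$; $i = 1(1)n-1$.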
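The backward recursion for the $a_{n,i}$ and the resulting cdf of the sample mean are easy to evaluate numerically. The sketch below is a minimal Python illustration under the paper's assumption $\sigma = 1$; the names `c_coef`, `a_coefs` and `mean_cdf` are hypothetical helpers chosen for this example.

```python
from math import exp, factorial

def c_coef(n, i):
    """c_{n,i} = (2n-i-2)! / ((n-i-1)! i! 2^(n-i-1))."""
    return factorial(2 * n - i - 2) // (
        factorial(n - i - 1) * factorial(i) * 2 ** (n - i - 1))

def a_coefs(n):
    """Return [a_{n,0}, ..., a_{n,n-1}] from a_{n,i} = c_{n,i} + (i+1) a_{n,i+1}."""
    a = [0] * (n + 1)  # a[n] = a_{n,n} = 0 starts the recursion
    for i in range(n - 1, -1, -1):
        a[i] = c_coef(n, i) + (i + 1) * a[i + 1]
    return a[:n]

def mean_cdf(x, theta, n):
    """F_n(x | theta) for the mean of n Laplace(theta, 1) observations."""
    z = n * abs(x - theta)
    c_n = 2 ** n * factorial(n - 1)
    tail = exp(-z) * sum(a_i * z ** i for i, a_i in enumerate(a_coefs(n))) / c_n
    return tail if x < theta else 1.0 - tail

if __name__ == "__main__":
    print(a_coefs(4))             # coefficients of the n = 4 column of Table 2
    print(mean_cdf(0.5, 0.0, 4))  # cdf of the mean of 4 observations at x = 0.5
```

Because the recursion is written in terms of the index $i = n - j$, the list returned by `a_coefs` is ordered by increasing power of $n|x - \theta_i|$.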


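Example: If we want to obtain the density and the cumulative distribution function of the sample
mean of size $n = 4$ from a double exponential model, checking the column $n = 4$ from both Table 1
and Table 2, we can easily see that

$$f_4(|x - \theta_i|) = \frac{e^{-4|x - \theta_i|}}{96} \left( 4^4 |x - \theta_i|^3 + 6 \times 4^3 |x - \theta_i|^2 + 15 \times 4^2 |x - \theta_i| + 15 \times 4 \right),$$

and

$$F_4(x|\theta_i) = \begin{cases} \dfrac{e^{-4|x - \theta_i|}}{96} \left( 4^3 |x - \theta_i|^3 + 9 \times 4^2 |x - \theta_i|^2 + 33 \times 4 |x - \theta_i| + 48 \right), & x < \theta_i, \\[2ex] 1 - \dfrac{e^{-4|x - \theta_i|}}{96} \left( 4^3 |x - \theta_i|^3 + 9 \times 4^2 |x - \theta_i|^2 + 33 \times 4 |x - \theta_i| + 48 \right), & \text{otherwise.} \end{cases}$$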
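The $n = 4$ density in this example can be compared with the general formula (6), and its total mass can be checked by numerical integration. The following Python sketch does both; `mean_pdf`, `pdf4` and `simpson` are names assumed here for illustration, and composite Simpson integration is simply a convenient choice, not a method taken from the paper.

```python
from math import exp, factorial

def mean_pdf(x, theta, n):
    """General density (6) of the mean of n Laplace(theta, 1) observations."""
    z = n * abs(x - theta)
    c_n = 2 ** n * factorial(n - 1)
    poly = sum(
        factorial(2 * n - i - 2)
        // (factorial(n - i - 1) * factorial(i) * 2 ** (n - i - 1)) * z ** i
        for i in range(n))
    return n * exp(-z) * poly / c_n

def pdf4(x, theta):
    """Closed form of the example for n = 4."""
    t = abs(x - theta)
    return exp(-4 * t) / 96 * (4**4 * t**3 + 6 * 4**3 * t**2 + 15 * 4**2 * t + 15 * 4)

def simpson(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for m in range(1, steps):
        total += (4 if m % 2 else 2) * func(lo + m * h)
    return total * h / 3

if __name__ == "__main__":
    print(mean_pdf(1.3, 1.0, 4), pdf4(1.3, 1.0))
    print(simpson(lambda x: mean_pdf(x, 1.0, 4), -9.0, 11.0))  # close to 1
```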


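To compare the percentage points of the sample mean and the sample median, let

$$Z_n = \sqrt{n/2} \, (\bar{X}_n - \theta), \qquad Z_n^* = \sqrt{n/2} \, (\tilde{X}_n - \theta),$$

where $\bar{X}_n$ and $\tilde{X}_n$ denote the sample mean and the sample median of $n$ observations.

Since the cdf of $Z_n^*$ for odd $n$ is much easier to derive (see Gupta and Leong 1979), we
will only provide the comparison of the percentage points of $Z_n$ and $Z_n^*$ for $n = 3, 5, \ldots, 21$ (Table
3). The percentage points for the distribution of the sample mean when $n = 2, 4, \ldots, 20$ are
provided in a separate table (Table 4).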
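Percentage points of the sample mean can be recomputed from the closed-form tail of the sum of $n$ standard double exponential variables. The Python sketch below assumes that $Z_n$ denotes the mean standardized by its standard deviation $\sqrt{2/n}$, an interpretation that reproduces the first entries of Tables 3 and 4; the helper names and the bisection bounds are choices made for this illustration.

```python
from math import exp, factorial, sqrt

def tail_coefs(n):
    """Return (c_n, [a_{n,0}, ..., a_{n,n-1}]) via a_i = c_i + (i+1) a_{i+1}."""
    c = [factorial(2 * n - i - 2)
         // (factorial(n - i - 1) * factorial(i) * 2 ** (n - i - 1))
         for i in range(n)]
    a = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        a[i] = c[i] + (i + 1) * a[i + 1]
    return 2 ** n * factorial(n - 1), a[:n]

def upper_tail(u, n):
    """P(U > u) for u >= 0, where U is the sum of n standard Laplace variables."""
    c_n, a = tail_coefs(n)
    return exp(-u) * sum(a_i * u ** i for i, a_i in enumerate(a)) / c_n

def upper_point(n, prob):
    """Upper 100*prob percentage point of Z_n (assumed standardized mean), prob > 0.5."""
    lo, hi = 0.0, 60.0
    for _ in range(80):  # bisection on the decreasing tail probability
        mid = 0.5 * (lo + hi)
        if upper_tail(mid, n) > 1.0 - prob:
            lo = mid
        else:
            hi = mid
    # Z_n = sqrt(n/2) * (mean - theta) = U / sqrt(2n)
    return 0.5 * (lo + hi) / sqrt(2 * n)

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, round(upper_point(n, 0.75), 4))
```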


3 Using the Sample Mean to Select the Largest Location Parameter

If we assume that no prior information about the parameter $\underline{\theta} = (\theta_1, \theta_2, \ldots, \theta_k)$ is available, then we
usually use either the classical subset selection approach or the indifference zone formulation in
our ranking and selection problem. In the following, we only study the subset selection approach.

(A) Formulation of the Problem: The classical Maximum-Type Approach for any location-type
problem has been well studied, so we do not give many details, but simply state some
interesting results without proof.

For selecting the population associated with the largest location parameter with a correct selection
(CS) probability of at least $P^*$ ($1/k < P^* < 1$) from $k$ double exponential populations, where we
have a sample mean $\bar{X}_i$ of size $n$ from each $\Pi_i$, $i = 1, 2, \ldots, k$, the Classical Maximum-Type Subset
Selection Rule proposed by Gupta (1956) is defined as follows:

$R^{\max}$: Select $\Pi_i$ iff $\bar{X}_i \ge \max_{1 \le j \le k} \bar{X}_j - d/\sqrt{n}$ for some $d$ ($> 0$),

where $d$ ($> 0$) is the smallest value satisfying:

$$\int_{-\infty}^{\infty} F_n^{k-1}(u + d/\sqrt{n}) \, f_n(u) \, du \ge P^*.$$

The usual condition $P(CS|R^{\max}) \ge P^*$ is guaranteed by the following theorem:

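Theorem 3.1

$$\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R^{\max}) = \inf_{\underline{\theta} \in \Omega_0} P_{\underline{\theta}}(CS|R^{\max}) = \int_{-\infty}^{\infty} F_n^{k-1}(u + d/\sqrt{n}) \, f_n(u) \, du,$$

where $\Omega \supset \Omega_0 = \{ \underline{\theta} : \theta_1 = \theta_2 = \cdots = \theta_k; \ -\infty < \theta_i < \infty, \ i = 1, 2, \ldots, k \}$.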

(B) Table of Necessary Constants for $R^{\max}$: For given $k$, $n$, and some particular values of
$P^*$, the constants $d/\sqrt{n} = d(k, n, P^*)$ which satisfy

$$P^* = \int_{-\infty}^{\infty} F_n^{k-1}(u + d/\sqrt{n}) \, f_n(u) \, du$$

are given in Table 5.
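Constants of this kind can be recomputed by combining the closed-form density and cdf of the sample mean with numerical integration and a root search. The following Python sketch is one possible approach, not the method used for Table 5: the names `mean_dist`, `pcs_equal_means` and `solve_shift`, the Simpson rule, the integration range and the bisection bounds are all assumptions made for this illustration.

```python
from math import exp, factorial

def mean_dist(n):
    """Return (pdf, cdf) of the mean of n standard Laplace variables (theta = 0)."""
    c = [factorial(2 * n - i - 2)
         // (factorial(n - i - 1) * factorial(i) * 2 ** (n - i - 1))
         for i in range(n)]
    a = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        a[i] = c[i] + (i + 1) * a[i + 1]
    a, c_n = a[:n], 2 ** n * factorial(n - 1)

    def pdf(x):
        z = n * abs(x)
        return n * exp(-z) * sum(ci * z ** i for i, ci in enumerate(c)) / c_n

    def cdf(x):
        z = n * abs(x)
        tail = exp(-z) * sum(ai * z ** i for i, ai in enumerate(a)) / c_n
        return tail if x < 0 else 1.0 - tail

    return pdf, cdf

def pcs_equal_means(shift, n, k, steps=4000):
    """Simpson approximation of the integral of F_n^(k-1)(u + shift) f_n(u) du."""
    pdf, cdf = mean_dist(n)
    half = 40.0 / n  # the density is negligible beyond this range
    h = 2 * half / steps
    total = 0.0
    for m in range(steps + 1):
        u = -half + m * h
        w = 1 if m in (0, steps) else (4 if m % 2 else 2)
        total += w * cdf(u + shift) ** (k - 1) * pdf(u)
    return total * h / 3

def solve_shift(n, k, p_star):
    """Smallest shift d/sqrt(n) with pcs_equal_means(shift, n, k) >= p_star."""
    lo, hi = 0.0, 40.0 / n
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if pcs_equal_means(mid, n, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(round(solve_shift(1, 2, 0.75), 4))
    print(round(solve_shift(3, 2, 0.75), 4))
```

With a zero shift the integral reduces to $1/k$, which gives a simple sanity check on the quadrature before solving for the constant.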


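(C) Asymptotic Results for the Procedure $R^{\max}$: For large $n$, we can use the normal
distribution to approximate the infimum of $P_{\underline{\theta}}(CS|R^{\max})$. Since

$$\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R^{\max}) = \inf_{\underline{\theta} \in \Omega_0} P_{\underline{\theta}}(CS|R^{\max}),$$

it suffices to consider the case where $\underline{\theta} \in \Omega_0$. Now we have, approximately,

$$\frac{\bar{X}_j - \theta}{\sigma_n} \sim N(0, 1), \quad j = 1, 2, \ldots, k,$$

where $\sigma_n^2 = 2/n$, so the probability of the event

$$\bar{X}_k \ge \max_{1 \le j \le k} \bar{X}_j - d/\sqrt{n}$$

is, asymptotically, the same as that of

$$Z_k \ge \max_{1 \le j \le k} Z_j - d/\sqrt{2},$$

where $Z_j$, $j = 1, 2, \ldots, k$, are i.i.d. standard normal variables; thus

$$\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R^{\max}) \approx P\left( Z_k \ge \max_{1 \le j \le k} Z_j - d/\sqrt{2} \right) = \int_{-\infty}^{\infty} \Phi^{k-1}(u + d/\sqrt{2}) \, d\Phi(u). \tag{12}$$

On the other hand, if we use the sample median in the selection procedure, we will have, asymptotically,

$$\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R^{\max}_{median}) \approx P\left( Z_k \ge \max_{1 \le j \le k} Z_j - d_{median} \right) = \int_{-\infty}^{\infty} \Phi^{k-1}(u + d_{median}) \, d\Phi(u).$$

Thus, in order to have the same probability of a correct selection for both selection rules based on
the different statistics, we must have, for large $n$,

$$d \approx \sqrt{2} \, d_{median}. \tag{13}$$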
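The normal approximation is straightforward to evaluate numerically. The Python sketch below integrates $\Phi^{k-1}(u + \text{shift})$ against the standard normal density with a composite Simpson rule; the function names and the truncation of the range to $[-10, 10]$ are assumptions made for this illustration.

```python
from math import erf, exp, pi, sqrt

def phi(u):
    """Standard normal density."""
    return exp(-0.5 * u * u) / sqrt(2.0 * pi)

def Phi(u):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def pcs_normal(shift, k, half=10.0, steps=2000):
    """Simpson approximation of the integral of Phi^(k-1)(u + shift) dPhi(u)."""
    h = 2 * half / steps
    total = 0.0
    for m in range(steps + 1):
        u = -half + m * h
        w = 1 if m in (0, steps) else (4 if m % 2 else 2)
        total += w * Phi(u + shift) ** (k - 1) * phi(u)
    return total * h / 3

def pcs_mean(d, k):
    """Large-sample infimum of P(CS) for the rule based on sample means."""
    return pcs_normal(d / sqrt(2.0), k)

def pcs_median(d_median, k):
    """Large-sample infimum of P(CS) for the rule based on sample medians."""
    return pcs_normal(d_median, k)

if __name__ == "__main__":
    # The two rules agree when d = sqrt(2) * d_median.
    print(pcs_mean(sqrt(2.0) * 1.5, 5), pcs_median(1.5, 5))
```

For $k = 2$ the integral has the closed form $\Phi(\text{shift}/\sqrt{2})$, which is a convenient check on the quadrature.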

(D) Sensitivity of the Assumption of Double Exponential: Suppose we have $k$ populations
$\Pi_1, \Pi_2, \ldots, \Pi_k$, where $\Pi_i$ is characterized by a location parameter $\theta_i$. If we do not know whether
these $k$ populations have normal, logistic, or double exponential distributions, then selecting the
population associated with the largest location parameter becomes a problem, because the real
distribution of the populations is unknown. We will show that the double exponential distribution
model provides a safeguard, as explained below.

If the sample size $n$ is large, we know that the infimum of $P_{\underline{\theta}}(CS|R^{\max})$ for the double exponential
populations is approximately given by (12). On the other hand, for the normal means problem, we
have

$$\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R^{\max}_N) = \int_{-\infty}^{\infty} \Phi^{k-1}(u + d_N) \, d\Phi(u),$$

because

$$\bar{X}_k \ge \max_{1 \le j \le k} \bar{X}_j - d_N \sigma/\sqrt{n}$$

is equivalent to

$$\sqrt{n}(\bar{X}_k - \theta)/\sigma \ge \max_{1 \le j \le k} \sqrt{n}(\bar{X}_j - \theta)/\sigma - d_N,$$

and $\sqrt{n}(\bar{X}_j - \theta)/\sigma \sim N(0, 1)$. Similarly, for the logistic distribution model, we have

$$\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R^{\max}_L) \approx \int_{-\infty}^{\infty} \Phi^{k-1}(u + d_L) \, d\Phi(u);$$

therefore,

$$d \approx \sqrt{2} \, d_N \approx \sqrt{2} \, d_L.$$

It is clear from the above that the $d$-values for the double exponential provide conservative bounds
for the other two models, if $n$ is large.


When $n$ is small, for instance, for $n = 10$, $k = 10$, we have the following:

    P*-value     0.75      0.90      0.95      0.99
    d            3.1971    4.2510    4.9063    6.1968
    d_L          2.2639    2.9925    3.4390    4.3029
    d_N          2.2637    2.9829    3.4182    4.2456

    d_N-value excerpted from Bechhofer (1954)
    d_L-value excerpted from Han (1987 Ph.D. Thesis)

From this we again see that the $d$-values for the double exponential provide conservative bounds
for the normal and logistic models for the problem of selecting the unknown location parameter.


4  Selection  Using  a  Non-informative  Prior 

In the Classical Maximum-Type Subset Selection Procedure, it is easy to notice that the selected
subset size $|s|$ is a random variable which is not fixed in advance.

In general, for any location or scale parameter situation, Gupta (1965) proved that:

(1) The procedure of the above type is monotone, and

(2) If the distribution $F(x, \theta)$ possesses a density $f(x, \theta)$ having a monotone likelihood ratio (MLR)
in $x$, then $E(|s|)$ is maximized when $\theta_1 = \theta_2 = \cdots = \theta_k$ and the maximum is $kP^*$.

So, in the worst case, the expected proportion in the selected subset is equal to $P^*$. Furthermore,
it may select populations such that, depending on the unknown parameter $\underline{\theta}$, we may get an actual
$P(CS)$ much larger than $P^*$.

In this section, we will regard the likelihood function of $\theta_i$ as the distribution of $\Theta_i$ given $X_i = x_i$. It is
the same as saying that, based on the distribution of a statistic (in our case the sample mean
and the sample median), we assume that, independently, each $\Theta_i$ has a non-informative prior,
$i = 1, 2, \ldots, k$.

4.1  Bayes  Selection  Procedure 

In the following, we will consider a more general case. We assume

$$X_i \sim f(|x - \theta_i|),$$

i.e. the density of $X_i$ given $\Theta_i = \theta_i$ is symmetric about $\theta_i$ (for the case where $f(\cdot)$ is not symmetrical,
we have obtained some results which will be available later), and

$$\Theta_i \sim \pi(\theta) = 1, \quad i = 1, 2, \ldots, k.$$

Now, we will make decisions based on the posterior distributions of the $\Theta_i$.

From a Bayes perspective, in order to select the population associated with the largest parameter
$\theta_{[k]}$ with a guaranteed posterior probability of a correct selection of at least $P^*$ ($1/k < P^* < 1$)
(the so-called $PP^*$-condition, see Gupta and Yang (1986)), we should consider the following events

$$A_i = \{ \theta_i \text{ is the largest} \mid \underline{X} = \underline{x} \}, \quad i = 1, 2, \ldots, k.$$

Now, using the non-informative prior, we have

$$\Theta_i \mid \underline{X} = \underline{x} \sim f(|x_i - \theta_i|), \quad i = 1, 2, \ldots, k.$$

Let $p_i(\underline{x})$ be the probability of event $A_i$; then

$$\begin{aligned} p_i(\underline{x}) &= P(\theta_i \text{ is the largest} \mid \underline{x}) \\ &= P(\theta_i \ge \theta_j, \ \forall j, \ j \ne i \mid \underline{x}) \\ &= P(\theta_i - x_i \ge \theta_j - x_j - (x_i - x_j), \ \forall j, \ j \ne i \mid \underline{x}) \\ &= \int_{-\infty}^{\infty} \prod_{j \ne i} F(u + (x_i - x_j)) \, f(u) \, du, \end{aligned}$$

where $F(\cdot)$ is the cdf of $f(\cdot)$.

Lemma 4.1: (1) The posterior probability $p_i(\underline{x})$ depends only on the differences $x_i - x_j$, $j =
1, 2, \ldots, k$, $j \ne i$.

(2) $p_i(\underline{x})$ is non-increasing in $x_j$, $j \ne i$, keeping other components of $\underline{x}$ fixed, and it is
non-decreasing in $x_i$, keeping other components of $\underline{x}$ fixed.

(3) $p_i(\underline{x}) \ge p_j(\underline{x})$ if and only if $x_i \ge x_j$.

Proof: The proof is straightforward and hence omitted. □
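The posterior probabilities $p_i(\underline{x})$ and the properties listed in Lemma 4.1 can be illustrated numerically. The Python sketch below takes $f$ to be the standard double exponential density (a single observation per population) purely as an example; the function names, the Simpson rule and the integration range are assumptions of this illustration, and any symmetric density with its cdf could be passed in instead.

```python
from math import exp

def laplace_pdf(u):
    """Standard double exponential density."""
    return 0.5 * exp(-abs(u))

def laplace_cdf(u):
    """Standard double exponential cdf."""
    return 0.5 * exp(u) if u < 0 else 1.0 - 0.5 * exp(-u)

def posterior_best(x, pdf=laplace_pdf, cdf=laplace_cdf, half=40.0, steps=4000):
    """p_i(x) = integral of prod_{j != i} F(u + x_i - x_j) f(u) du, for each i."""
    h = 2 * half / steps
    probs = []
    for i, xi in enumerate(x):
        total = 0.0
        for m in range(steps + 1):
            u = -half + m * h
            w = 1 if m in (0, steps) else (4 if m % 2 else 2)
            prod = pdf(u)
            for j, xj in enumerate(x):
                if j != i:
                    prod *= cdf(u + xi - xj)
            total += w * prod
        probs.append(total * h / 3)
    return probs

if __name__ == "__main__":
    p = posterior_best([0.0, 1.0, 2.5])
    print(p, sum(p))  # probabilities increase with x_i and sum to about 1
```

Shifting every observation by the same constant leaves the output unchanged, which is property (1) of the lemma.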

Theorem 4.1: For any subset $S$ of the whole populations $\Pi_1, \Pi_2, \ldots, \Pi_k$, let $PP(CS|S, \underline{x})$ denote
the posterior probability of a correct selection for the subset $S$ (i.e. the subset $S$ contains the best
population) based on a random sample $\underline{x}$; then

(1) $PP(CS|S, \underline{x})$ is non-increasing in $x_j$, if $j \notin S$, keeping other components of $\underline{x}$ fixed, and

(2) $PP(CS|S, \underline{x})$ is non-decreasing in $x_i$, if $i \in S$, keeping other components of $\underline{x}$ fixed.

Proof: Since

$$PP(CS|S, \underline{x}) = \sum_{i \in S} p_i(\underline{x}) = 1 - \sum_{i \notin S} p_i(\underline{x}). \tag{14}$$

Now, $p_i(\underline{x})$ is non-increasing in $x_j$, if $j \notin S$, for all $i \in S$, so $PP(CS|S, \underline{x})$ is non-increasing in
$x_j$, $j \notin S$, by the first part of equation (14).

On the other hand, the second part of equation (14) and the fact that $p_j(\underline{x})$ is non-increasing in
$x_i$, if $i \in S$, for all $j \notin S$, imply that $PP(CS|S, \underline{x})$ is non-decreasing in $x_i$, $i \in S$. □

From the Bayesian analysis, we know that the Bayes Decision Rule ($R^B$) will select the $t$ populations
associated with the $t$ largest values of $p_i(\underline{x})$ (i.e. the Bayes set $s^B = \{\Pi_{[k]}, \ldots, \Pi_{[k-t+1]}\}$),
where the integer $t$ ($\ge 1$) satisfies

$$\sum_{m=k-t+1}^{k} p_{[m]}(\underline{x}) \ge P^*$$

and

$$\sum_{m=k-t+2}^{k} p_{[m]}(\underline{x}) < P^*,$$

where $p_{[1]}(\underline{x}) \le p_{[2]}(\underline{x}) \le \cdots \le p_{[k]}(\underline{x})$ are the ordered values of the $p_i(\underline{x})$'s, and $s^B$ is the subset selected
by the Bayes selection rule $R^B$.
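Once the posterior probabilities are available, the Bayes set is obtained by accumulating them from the largest downwards until the total reaches $P^*$. A minimal Python sketch follows; the function name `bayes_subset` and the zero-based population indices are assumptions of this illustration.

```python
def bayes_subset(p, p_star):
    """Return indices of the smallest set of populations whose posterior
    probabilities, taken from the largest downwards, sum to at least p_star."""
    order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
    chosen, total = [], 0.0
    for i in order:
        chosen.append(i)
        total += p[i]
        if total >= p_star:
            break
    return chosen

if __name__ == "__main__":
    posterior = [0.05, 0.10, 0.15, 0.70]  # hypothetical values of p_i(x)
    print(bayes_subset(posterior, 0.80))
    print(bayes_subset(posterior, 0.60))
```

The loop stops at the first $t$ for which the top-$t$ sum reaches $P^*$, so the top-$(t-1)$ sum is necessarily below $P^*$, matching the two conditions on $t$.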
4.2 A Lower Bound on the $PP(CS)$ for the Subset Selection Rule $R^{\max}$

Under the Maximum-Type Subset Selection Rule $R^{\max}$ defined in the previous section, we know
that the larger the value $x_i$ is, the larger the chance that the corresponding population $\Pi_i$ will be
selected.

Under the rule $R^{\max}$, we will pick the population $\Pi_i$ if its $x_i$ value is at least $x_{[k]} - d$, and reject
$\Pi_i$ if $x_i < x_{[k]} - d$. This leads to the following observations.

Observations: For the Maximum-Type Subset Selection Rule $R^{\max}$, we know at least the following
two facts:

(1) $R^{\max}$ will always pick population $\Pi_{[k]}$, i.e. the population associated with the largest value $x_{[k]}$.

(2) Every $\Pi_i$ not selected by $R^{\max}$ must have its $x_i$ value less than $x_{[k]} - d$.

Theorem 4.2: If the subset selection rule $R^{\max}$ selects $t$ populations (i.e. selects populations
$\Pi_{[k]}, \ldots, \Pi_{[k-t+1]}$, where $\Pi_{[j]}$ is the population associated with the ordered value $x_{[j]}$), under the
classical selection procedure, then

$$PP(CS|R^{\max}, \underline{x}) \ge PP(CS|R^{\max}, \underline{x} \in \Lambda_0) = P^* + \frac{t-1}{k-1}(1 - P^*), \tag{15}$$

where $\Lambda_0 = \{ \underline{x} : x_{[k]} - d = x_{[k-1]} = \cdots = x_{[1]} \}$.

Remark 4.2: A similar result for the normal model has been given in Gupta and Yang (1985).
Here, we will give a probabilistic proof of the above theorem.

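Proof: The first part of the inequality [i.e. $PP(CS|R^{\max}, \underline{x}) \ge PP(CS|R^{\max}, \underline{x} \in \Lambda_0)$] follows
from the above observations and Theorem 4.1.

When $\underline{x} \in \Lambda_0$, we know that $p_{[k]}(\underline{x}) \ge p_{[k-1]}(\underline{x}) = \cdots = p_{[1]}(\underline{x})$, and

$$p_{[k]}(\underline{x}) = \int_{-\infty}^{\infty} \prod_{j \ne k} F(u + (x_{[k]} - x_{[j]})) \, f(u) \, du = \int_{-\infty}^{\infty} \prod_{j \ne k} F(u + d) \, f(u) \, du = \int_{-\infty}^{\infty} F^{k-1}(u + d) \, f(u) \, du = P^*.$$

Since $\sum_i p_i(\underline{x}) = 1$, we have $p_{[1]}(\underline{x}) = p_{[2]}(\underline{x}) = \cdots = p_{[k-1]}(\underline{x}) = \frac{1}{k-1}(1 - P^*)$; hence the
result follows. □

Since $PP(CS|R^{\max}, \underline{x}) \ge P^*$, and it is strictly larger than $P^*$ once we pick more than one population,
we certainly can find a better subset selection rule by simply utilizing the lower bound on
$PP(CS|R^{\max}, \underline{x})$.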

4.3  Some  New  Selection  Procedures 
First,  let  us  consider  the  following  selection  procedure: 

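Let $\Delta x_{[i]} = x_{[k]} - x_{[k-i]}$ for $i = 1, 2, \ldots, k-1$, where $x_{[1]} \le x_{[2]} \le \cdots \le x_{[k]}$ are the ordered values
of the $x_i$'s. Then we compute the following $k - 1$ numbers:

$$P^*_{(i)} = \int_{-\infty}^{\infty} F^{k-1}(u + \Delta x_{[i]}) \, dF(u). \tag{16}$$

Since $0 \le \Delta x_{[1]} \le \cdots \le \Delta x_{[k-1]}$, therefore

$$0 < P^*_{(1)} \le P^*_{(2)} \le \cdots \le P^*_{(k-1)} \ (< 1).$$

Next, we compute:

$$Q^*_{(i)} = P^*_{(i)} + \frac{i-1}{k-1}(1 - P^*_{(i)}), \quad i = 1, 2, \ldots, k-1. \tag{17}$$

Lemma 4.2: For values of $\Delta x_{[i]}$, where $0 \le \Delta x_{[1]} \le \cdots \le \Delta x_{[k-1]}$, we have

$$0 < Q^*_{(1)} \le \cdots \le Q^*_{(k-1)} \ (< 1). \tag{18}$$

Proof: Actually, we have

$$Q^*_{(i)} = 1 - \frac{k-i}{k-1}(1 - P^*_{(i)}),$$

so $Q^*_{(i)}$ is increasing in $i$, because $k - i$ is decreasing in $i$ and $1 - P^*_{(i)}$ is decreasing in $\Delta x_{[i]}$ (thus in
$i$); hence the result. □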

Now, we propose the following subset selection rule $R_1$:

For any preassigned guarantee probability $P^*$ ($1/k < P^* < 1$), if there exists a smallest $i_0$
which satisfies $Q^*_{(i_0)} \ge P^*$, then the subset selection rule $R_1$ is

$$R_1: \text{ Select } \Pi_{[j]} \text{ iff } j \ge k - i_0 + 1. \tag{19}$$

The subset selection rule $R_1$ will take $s = \{\Pi_{[k]}, \ldots, \Pi_{[k-i_0+1]}\}$ as our selected subset.
Otherwise, $R_1$ will select all populations.
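The rule $R_1$ can be sketched in code by computing, for $i = 1, \ldots, k-1$, the gap $\Delta x_{[i]}$, the integral $P^*_{(i)}$, and the bound $Q^*_{(i)} = P^*_{(i)} + \frac{i-1}{k-1}(1 - P^*_{(i)})$, stopping at the first $i$ with $Q^*_{(i)} \ge P^*$. The Python sketch below uses the standard double exponential density for $F$ as an example only; the function names, the Simpson rule and the integration range are assumptions of this illustration.

```python
from math import exp

def laplace_pdf(u):
    """Standard double exponential density."""
    return 0.5 * exp(-abs(u))

def laplace_cdf(u):
    """Standard double exponential cdf."""
    return 0.5 * exp(u) if u < 0 else 1.0 - 0.5 * exp(-u)

def p_star_i(delta, k, pdf=laplace_pdf, cdf=laplace_cdf, half=40.0, steps=4000):
    """Simpson approximation of the integral of F^(k-1)(u + delta) dF(u)."""
    h = 2 * half / steps
    total = 0.0
    for m in range(steps + 1):
        u = -half + m * h
        w = 1 if m in (0, steps) else (4 if m % 2 else 2)
        total += w * cdf(u + delta) ** (k - 1) * pdf(u)
    return total * h / 3

def rule_r1(x, p_star, pdf=laplace_pdf, cdf=laplace_cdf):
    """Return indices selected by R_1, largest observation first."""
    k = len(x)
    order = sorted(range(k), key=lambda i: x[i], reverse=True)
    top = x[order[0]]
    for i in range(1, k):
        delta = top - x[order[i]]                # gap between x_[k] and x_[k-i]
        p_i = p_star_i(delta, k, pdf, cdf)
        q_i = p_i + (i - 1) / (k - 1) * (1.0 - p_i)
        if q_i >= p_star:
            return order[:i]                     # the i largest observations
    return order                                 # no such i: select everything

if __name__ == "__main__":
    print(rule_r1([0.1, 2.6, 3.5, -1.9, -0.6], 0.75))
```

When all observations are equal, $P^*_{(i)} = 1/k$ and $Q^*_{(i)} = i/k$, which gives a simple hand check of the stopping index.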

Remark 4.3: To implement the procedure $R_1$, we examine the posterior probabilities at the following
$k - 1$ stages:

Stage 1. Pull all $k - 2$ values $x_{[i]}$, $i = 1, 2, \ldots, k-2$, to the point $x_{[k-1]}$, and check if

$$\frac{k-1}{k-1}(1 - P^*_{(1)}) \le 1 - P^*.$$

If the above holds, we select $s = \{\Pi_{[k]}\}$ and terminate the process; if not, we go to

Stage 2. Pull all $k - 2$ values $x_{[i]}$, $i \ne k-2$, $i = 1, 2, \ldots, k-1$, to the point $x_{[k-2]}$, and check if

$$\frac{k-2}{k-1}(1 - P^*_{(2)}) \le 1 - P^*.$$

If it holds, we select $s = \{\Pi_{[k]}, \Pi_{[k-1]}\}$ and terminate the process; if not, we go to Stage 3, and so
on, until we can find an $i$ such that

$$\frac{k-i}{k-1}(1 - P^*_{(i)}) \le 1 - P^*,$$

and then we select $s = \{\Pi_{[k]}, \ldots, \Pi_{[k-i+1]}\}$. If there does not exist such an $i$, we select all
populations.

For the other subset selection rules $R_2, \ldots, R_{k-1}$, we give the following remark:

Remark 4.4: Note that in the process of deriving the subset selection rule $R_1$, we divided the
data into two groups, and put only one value (i.e. $x_{[k]}$) into the first group. Now we can develop
it in two directions.

(a)  By  putting  more  xj^j’s  into  the  first  group,  we  can  actually  replace  by  as  follows: 

gr,  =  max  (1  -  ,  ~  ,  Rum))r 

where 

/+00 

-  (*(fc-mi  -  xifc_.i))dF*-"-i(«),  m  =  0, 1,  -  1, 

-OO 

is the posterior probability of $p_{[1]}(\underline{x}) = \cdots = p_{[k-m-1]}(\underline{x})$, when we pull $x_{[k]}, \ldots, x_{[k-m+1]}$ to $x_{[k-m]}$
and $x_{[k-m-1]}, \ldots, x_{[1]}$ to $x_{[k-i]}$.

When $m = 0$, we have
which is the value we used in the rule $R_1$.

(b) We can also divide the data into more groups. Let $R_2$ be the rule for the case of 3
groups, $\ldots$, and $R_{k-1}$ be the rule for the case of $k$ groups; then in the case of $k$ groups, the subset
selection rule $R_{k-1}$ and the previous rule $R^B$ are identical. Later, it will be shown that $R_2$ can be as
good as $R^B$ and it is easier to implement from the computational viewpoint.

4.4  Properties  of  Subset  Selection  Rule 

We  can  easily  prove  the  following: 

Proposition 4.1: The subset selection rule $R_1$ is better than rule $R^{\max}$, in the sense that

(a) $PP(CS|R_1) \ge P^*$, because $PP(CS|R_1) \ge Q^*_{(i_0)} \ge P^*$, and

(b) $s_{R_1} \subseteq s_{R^{\max}}$, because $Q^*_{(i_0)} \ge P^*_{(i_0)}$.

Proposition 4.2: (a) The subset selection rules $R_1$ and $R^{\max}$ will take the same action if $x_{[1]} = \cdots =
x_{[k-1]}$, or when the subset selection rule $R_1$ selects all populations in its selected subset.

(b) The subset selection rules $R_1$, $R^B$ and $R^{\max}$ will take the same action if the subset selection
rule $R_1$ selects only one population.

Proposition 4.3: The subset selection rule $R_1$ possesses the advantage of the rule $R^{\max}$, because
the forms of the involved integration for $P^*_{(i)}$ and $P^*$ are identical.

Remark 4.5: The selection rule $R_1$ is like a modified version of the rule $R^{\max}$, in which the population
associated with the largest statistic possesses the probability $P^*$ of a correct selection, and the
remaining $|s| - 1$ populations in $s$ have a $P(CS)$ at least equal to $\frac{1}{k-1}(1 - P^*)$.

5 An Example for Comparison of the Several Subset Selection Rules

A data set of exponential random numbers generated by a statistical package G6-RVP designed by
H. Rubin and C. Hinkle at Purdue University was given in Gupta and Leong's paper (1979), where
9 observations for each of 5 sets of double exponential random numbers with location parameters
$\theta_i$ equal to $0, 2.5, 3.4, -2.0, -0.65$ were taken.


      Π_1         Π_2         Π_3         Π_4         Π_5
    -3.4839     -0.9839     -0.0839     -5.4839     -4.1339
    -2.6762     -0.1762      0.7238     -4.6762     -3.3262
    -0.3129      2.1871      3.0871     -2.3129     -0.9629
    -0.2264      2.2736      3.1736     -2.2264     -0.8764
    -0.1761      2.3239      3.2239     -2.1761     -0.8261
     0.1462      2.6462      3.5462     -1.8538     -0.5038
     0.3033      2.8033      3.7033     -1.6967     -0.3467
     1.6160      4.1160      5.0160     -0.3840      0.9660
     5.6924      8.1924      9.0924      3.6924      5.0424

To see how each subset selection rule performs, let

$x_i$ = the sample mean of $\Pi_i$ and $y_i$ = the sample median of $\Pi_i$;

then

$$\underline{x} = (x_1, \ldots, x_5)' = (0.0980, 2.5980, 3.4980, -1.9020, -0.5520)',$$

$$\underline{y} = (y_1, \ldots, y_5)' = (-0.1761, 2.3239, 3.2239, -2.1761, -0.8261)'.$$

Hence the differences of the $x_i$'s and $y_i$'s are $\Delta x_{32} = \Delta y_{32} = 0.90$, $\Delta x_{31} = \Delta y_{31} = 3.40$,
$\Delta x_{35} = \Delta y_{35} = 4.05$, $\Delta x_{34} = \Delta y_{34} = 5.40$.
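The sample means, medians and differences quoted here can be reproduced from the data table. The Python sketch below relies on the observation that every column of the table equals the first column shifted by its location parameter $\theta_i$ (this can be checked row by row against the table); the variable names are illustrative.

```python
from statistics import mean, median

# First column of the data table (population 1, theta_1 = 0).
base = [-3.4839, -2.6762, -0.3129, -0.2264, -0.1761,
        0.1462, 0.3033, 1.6160, 5.6924]
# Location parameters; each remaining column is the first one shifted by theta_i.
thetas = [0.0, 2.5, 3.4, -2.0, -0.65]
samples = [[v + t for v in base] for t in thetas]

means = [mean(col) for col in samples]
medians = [median(col) for col in samples]

# Differences between the largest statistic (population 3) and the others.
best = max(range(5), key=lambda i: means[i])
mean_gaps = {i + 1: means[best] - means[i] for i in range(5) if i != best}
median_gaps = {i + 1: medians[best] - medians[i] for i in range(5) if i != best}

if __name__ == "__main__":
    print([round(m, 4) for m in means])
    print([round(m, 4) for m in medians])
    print({i: round(g, 2) for i, g in mean_gaps.items()})
```

Because the columns differ only by shifts, the gaps between means and the gaps between medians coincide, which is why the same values of $\Delta x$ and $\Delta y$ appear in the text.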

(a) Now, we have the following:

    $PP(CS|R, \underline{x})$ for $R = R^B, R^{\max}, R_1, R_i$ ($i \ge 2$)
    when one population is picked

                                     using mean    using median
    $R^B$ or $R_i$ ($i \ge 2$)         0.9131         0.9380
    $R^{\max}$ or $R_1$                0.7700         0.8292

where, in the case of the sample mean, the integration for $R^B$ is

$$P^* = \int_{-\infty}^{\infty} F_9(u + 0.9) \times F_9(u + 3.4) \times F_9(u + 4.05) \times F_9(u + 5.4) \, dF_9(u).$$

The integration for $R_2$ is

$$P^* = \int_{-\infty}^{\infty} F_9(u + 0.9) \times F_9^3(u + 3.4) \, dF_9(u).$$

Also, the integration for $R_1$ or $R^{\max}$ is

$$P^* = \int_{-\infty}^{\infty} F_9^4(u + 0.9) \, dF_9(u),$$

where $F_9(\cdot)$ is the cdf of the sample mean of size 9.

The same applies to the case of the sample median. Note that the rule $R_2$ is as good as $R^B$.

(b) In the case where two populations are taken, we have probability one for all selection rules,
because

$$\int_{-\infty}^{\infty} F_9^4(u + 3.4) \, dF_9(u) \approx \int_{-\infty}^{\infty} G_4^4(u + 3.4) \, dG_4(u) \approx 1,$$

where $G_4(\cdot)$ is the cdf of the sample median of size 9 (see Gupta and Leong (1979)).
Table 1: Table of {c_n} and {c_{n,i}} for n = 2, 3, ..., 10

                                            sample size n
              2     3     4      5       6        7          8           9           10
    c_n       4    16    96    768    7680    92160    1290240    20643840    371589120
    c_{n,0}   1     3    15    105     945    10395     135135     2027025     34459425
    c_{n,1}   1     3    15    105     945    10395     135135     2027025     34459425
    c_{n,2}         1     6     45     420     4725      62370      945945     16216200
    c_{n,3}               1     10     105     1260      17325      270270      4729725
    c_{n,4}                      1      15      210       3150       51975       945945
    c_{n,5}                              1       21        378        6930       135135
    c_{n,6}                                       1         28         630        13860
    c_{n,7}                                                  1          36          990
    c_{n,8}                                                              1           45
    c_{n,9}                                                                           1


Table 2: Table of {c_n} and {a_{n,i}} for n = 2, 3, ..., 10

                                            sample size n
              2     3     4      5       6        7          8           9           10
    c_n       4    16    96    768    7680    92160    1290240    20643840    371589120
    a_{n,0}   2     8    48    384    3840    46080     645120    10321920    185794560
    a_{n,1}   1     5    33    279    2895    35685     509985     8294895    151335135
    a_{n,2}         1     9     87     975    12645     187425     3133935     58437855
    a_{n,3}               1     14     185     2640      41685      729330     14073885
    a_{n,4}                      1      20      345       6090      114765      2336040
    a_{n,5}                              1       27        588       12558       278019
    a_{n,6}                                       1         35         938        23814
    a_{n,7}                                                  1          44         1422
    a_{n,8}                                                              1           54
    a_{n,9}                                                                           1


Table 3: 1) Upper 100(1 - α) Percentage Points z_α of Z_n (Top Entry);
         2) Δ_α = z*_α - z_α, where z*_α is the Upper 100(1 - α) Percentage Point of Z*_n (Bottom Entry).

                                              sample size n
    1 - α       3        5        7        9       11       13       15       17       19       21
    0.750    0.6050   0.6321   0.6440   0.6507   0.6550   0.6580   0.6602   0.6619   0.6632   0.6643
             -.0825   -.1102   -.1249   -.1343   -.1409   -.1459   -.1499   -.1531   -.1557   -.1579
    0.900    1.2221   1.2432   1.2532   1.2590   1.2629   1.2656   1.2676   1.2692   1.2704   1.2715
             -.0739   -.1259   -.1591   -.1823   -.1995   -.2129   -.2237   -.2326   -.2401   -.2467
    0.950    1.6372   1.6385   1.6395   1.6402   1.6408   1.6412   1.6416   1.6419   1.6422   1.6424
             -.0369   -.1025   -.1484   -.1815   -.2066   -.2264   -.2426   -.2560   -.2675   -.2774
    0.975    2.0284   2.0026   1.9905   1.9836   1.9794   1.9763   1.9740   1.9724   1.9711   1.9699
              .0144   -.0632   -.1210   -.1637   -.1967   -.2229   -.2443   -.2623   -.2778   -.2910
    0.990    2.5214   2.4524   2.4194   2.3999   2.3871   2.3782   2.3715   2.3663   2.3621   2.3590
              .0976    .0055   -.0683   -.1236   -.1666   -.2014   -.2298   -.2539   -.2741   -.2923
    0.995    2.8821   2.7759   2.7246   2.6941   2.6746   2.6599   2.6495   2.6410   2.6343   2.6294
              .1684    .0659   -.0195   -.0842   -.1355   -.1758   -.2099   -.2387   -.2625   -.2844

Table 4: Upper 100(1 - α) Percentage Points of Z_n for even values of n

                                              sample size n
    1 - α       2        4        6        8       10       12       14       16       18       20
    0.750    0.5731   0.6218    [?]     0.6478   0.6531   0.6566   0.6592    [?]     0.6626   0.6638
    0.900    1.1986   1.2350   1.2489   1.2564   1.2611   1.2643   1.2667   1.2685   1.2698   1.2710
    0.950    1.6359   1.6379   1.6390   1.6399   1.6405   1.6411   1.6415   1.6418   1.6420   1.6423
    0.975    2.0563   2.0125   1.9955   1.9867   1.9814   1.9777   1.9751   1.9733   1.9717   1.9705
    0.990    2.5958   2.4792   2.4335   2.4084   2.3929   2.3822   2.3746   2.3688   2.3642   2.3605
    0.995    2.9944   2.8174   2.7466   2.7075   2.6831   2.6666   2.6544   2.6453   2.6379   2.6318


Table 5: Values of d/√n = d(n, k, P*) for n = 1, 2, ..., 10 and k = 2, 3, ..., 10

                                        number of populations k
     n    P*       2        3        4        5        6        7        8        9       10
     1   0.75   1.1462   1.7849   2.1575   2.4258   2.6365   2.8104   2.9584   3.0874   3.2015
         0.90   2.3972   3.0504   3.4336   3.7083   3.9234   4.1002   4.2503   4.3809   4.4964
         0.95   3.2716   3.9322   4.3197   4.5971   4.8138   4.9917   5.1427   5.2739   5.3899
         0.99   5.1910   5.8612   6.2549   6.5350   6.7538   6.9333   7.0853   7.2180   7.3343
     2   0.75   0.9580   1.3575   1.6011   1.7756   1.9110   2.0214   2.1144   2.1947   2.2653
         0.90   1.7893   2.1966   2.4393   2.6118   2.7452   2.8539   2.9454   3.0244   3.0939
         0.95   2.3470   2.7550   2.9962   3.1670   3.2992   3.4069   3.4975   3.5758   3.6447
         0.99   3.5237   3.9287   4.1660   4.3337   4.4634   4.5696   4.6582   4.7351   4.8032
     3   0.75   0.7379   1.1187   1.3251   1.4666   1.5740   1.6605   1.7328   1.7949   1.8493
         0.90   1.4421   1.7960   1.9924   2.1281   2.2316   2.3152   2.3853   2.4455   2.4983
         0.95   1.8926   2.2339   2.4250   2.5576   2.6591   2.7410   2.8098   2.8690   2.9209
         0.99   2.8096   3.1318   3.3142   3.4417   3.5394   3.6189   3.6855   3.7427   3.7932
     4   0.75   0.6478   0.9796   1.1575   1.2784   1.3696   1.4428   1.5037   1.5558   1.6013
         0.90   1.2564   1.5601   1.7268   1.8414   1.9283   1.9983   2.0567   2.1068   2.1507
         0.95   1.6399   1.9298   2.0909   2.2020   2.2866   2.3548   2.4119   2.4608   2.5038
         0.99   2.4086   2.6774   2.8290   2.9344   3.0150   3.0802   3.1351   3.1820   3.2234
     5   0.75   0.5842   0.8821   1.0407   1.1480   1.2286   1.2931   1.3466   1.3923   1.4322
         0.90   1.1280   1.3980   1.5454   1.6462   1.7224   1.7836   1.8346   1.8783   1.9165
         0.95   1.4673   1.7236   1.8650   1.9623   2.0362   2.0956   2.1452   2.1877   2.2249
         0.99   2.1401   2.3752   2.5071   2.5983   2.6682   2.7246   2.7719   2.8121   2.8477
     6   0.75   0.5361   0.8088   0.9532   1.0507   1.1237   1.1819   1.2301   1.2713   1.3072
         0.90   1.0320   1.2777   1.4110   1.5022   1.5707   1.6256   1.6714   1.7106   1.7450
         0.95   1.3396   1.5718   1.6992   1.7864   1.8530   1.9058   1.9504   1.9885   2.0215
         0.99   1.9453   2.1533   2.2734   2.3555   2.4170   2.4668   2.5078   2.5430   2.5752
     7   0.75   0.4983   0.7514   0.8850   0.9748   1.0420   1.0955   1.1397   1.1775   1.2103
         0.90   0.9575   1.1843   1.3071   1.3906   1.4535   1.5038   1.5457   1.5814   1.6126
         0.95   1.2408   1.4542   1.5713   1.6513   1.7119   1.7605   1.8010   1.8356   1.8658
         0.99   1.7952   1.9874   2.0951   2.1694   2.2258   2.2714   2.3095   2.3423   2.3708
     8   0.75   0.4675   0.7045   0.8295   0.9132   0.9758   1.0255   1.0666   1.1017   1.1321
         0.90   0.8969   1.1086   1.2230   1.3006   1.3590   1.4057   1.4444   1.4775   1.5064
         0.95   1.1609   1.3596   1.4683   1.5426   1.5987   1.6436   1.6810   1.7130   1.7410
         0.99   1.6750   1.8530   1.9526   2.0211   2.0731   2.1152   2.1504   2.1804   2.2068
     9   0.75   0.4419   0.6658   0.7835   0.8623   0.9210   0.9677   1.0063   1.0391   1.0676
         0.90   0.8468   1.0461   1.1535   1.2263   1.2811   1.3248   1.3610   1.3920   1.4189
         0.95   1.0950   1.2816   1.3835   1.4530   1.5055   1.5475   1.5825   1.6123   1.6384
         0.99   1.5764   1.7430   1.8358   1.8997   1.9482   1.9874   2.0200   2.0480   2.0726
    10   0.75   0.0015   0.5712   0.7179   0.8051   0.8665   0.9135   0.9516   0.9835   1.0110
         0.90   0.7512   0.9761   1.0869   1.1593   1.2126   1.2547   1.2893   1.3188   1.3443
         0.95   1.0141   1.2070   1.3078   1.3752   1.4255   1.4655   1.4986   1.5269   1.5515
         0.99   1.4865   1.6476   1.7362   1.7968   1.8428   1.8796   1.9103   1.9365   1.9596